Restructuring View Maintenance Plans for Large Update
Abstract
Materialized views defined over distributed data sources are a well-recognized technology in data integration, e-business, and the semantic web. Because information sources keep growing and change at rapid rates, there is increasing pressure to reduce the time taken to refresh such integration views. State-of-the-art incremental view maintenance requires O(n^2) maintenance queries (batch view maintenance) or more (sequential maintenance), where n is the number of data sources. In this work, we optimize maintenance performance by restructuring the batch view maintenance plan to reduce the number of maintenance queries sent to remote data sources when maintaining a large set of updates. We first propose an adjacent grouping strategy that exploits the regularity in the batch maintenance plan. This solution reduces the number of maintenance queries by sharing the common accesses to data sources. We then propose a conditional grouping approach that reduces the number of remote queries to O(n) by unifying heterogeneous subexpressions (deltas). A cost model for analyzing these approaches is provided. The proposed maintenance strategies have been implemented in our TxnWrap system. Experimental studies show that our conditional grouping algorithm achieves about a 300% improvement in total processing time over existing batch algorithms in most cases. Our experiments also reveal an additional dimension of this design space, namely the impact of the cooperation of the remote sources on maintenance performance.
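To make the query counts in the abstract concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a view defined as a join over n sources R1..Rn and the common textbook batch plan with one delta term per source, in which term i joins the delta of source i with every other source. The function names and the "new"/"old" state convention are hypothetical choices made for illustration. Under these assumptions the plan issues n(n-1) remote queries, while only n distinct sources are touched, which is the redundancy that grouping strategies aim to remove.

```python
from collections import Counter


def batch_plan(n):
    """Build an illustrative batch maintenance plan over sources 1..n.

    Each term i stands for: delta(R_i) joined with all other sources.
    A term is a list of (source_index, state) pairs, one per remote query.
    """
    plan = []
    for i in range(1, n + 1):
        # Assumed convention: sources before i are read in their updated
        # state, sources after i in their pre-update state.
        term = [(j, "new" if j < i else "old")
                for j in range(1, n + 1) if j != i]
        plan.append(term)
    return plan


def count_remote_queries(plan):
    """Total number of remote source accesses in the ungrouped plan."""
    return sum(len(term) for term in plan)


def accesses_per_source(plan):
    """How many times each source is queried across all delta terms."""
    return Counter(j for term in plan for j, _ in term)


if __name__ == "__main__":
    for n in (3, 5, 10):
        plan = batch_plan(n)
        # Columns: n, ungrouped remote queries, distinct sources accessed
        print(n, count_remote_queries(plan), len(accesses_per_source(plan)))
```

In this sketch every source is queried n-1 times, so any strategy that merges the accesses to one source into a constant number of queries would bring the total down from quadratic to linear in n. How the paper's adjacent and conditional grouping actually achieve this is not reproduced here.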
Similar resources
Incremental Maintenance of Schema-Restructuring Views
An important issue in data integration is the integration of semantically equivalent but schematically heterogeneous data sources. Declarative mechanisms supporting powerful source restructuring for such databases have been proposed in the literature, such as the SQL extension SchemaSQL. However, the issue of incremental maintenance of views defined in such languages remains an open problem. We...
Maintaining large update batches by restructuring and grouping
Materialized views defined over distributed data sources can be utilized by many applications to ensure better access, reliable performance, and high availability. Technology for maintaining materialized views is thus critical for providing up-to-date results, since a stale view extent may not help, or may even mislead, these applications. State-of-the-art incremental view maintenance requires O(n^2) or ...
Increasing the Speed of Incremental View Maintenance Using the Cuckoo Algorithm
A data warehouse is a repository of integrated data collected from various sources. It can maintain data from these sources in the form of views, so each view must be maintained and updated when the sources change. Since a growing number of updates may cause costly overhead, it is necessary to update views with high accuracy. Optimal Delta Evaluation method is...
Periodic flexible maintenance planning in a single-machine production environment
Preventive maintenance is an essential part of many maintenance plans. From the production point of view, flexibility in the maintenance intervals enhances manufacturing efficiency. In contrast, maintenance departments want to know the timing of long-term maintenance plans with as much certainty as possible. In a single-machine production environment, this paper proposes a simulation–o...
View Maintenance in Web Data Platforms
Modern Web Data Platforms (WDPs) handle large amounts of data and activity through massively distributed infrastructures. To achieve performance and availability at Internet scale, WDPs restrict querying capability and provide weaker consistency guarantees than traditional ACID transactions. The sheer volume of parallel processing without ACID transaction guarantees, and the large number of ind...
Publication date: 2003